-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add updated span events + trace more methods #1259
feat: add updated span events + trace more methods #1259
Conversation
b0d9a6c
to
f437955
Compare
@sakthivelmanii @harshachinta kindly help me run the bots. Thank you |
# Empty context manager. Users will have to check if the generated value is None or a span | ||
yield None | ||
return | ||
|
||
if session: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@odeke-em
We should move this latest change of updating session last use time to beginning of this function, otherwise when opentlemetry is not installed then that update will never happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@odeke-em
this is not yet addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed that @harshachinta as it is from the bigger PR and I don't think we shall need it anymore once some tests are sent in the other PR. I deleted it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not from your PR, this was added by us to handle one case. Don't remove that
I am talking about this code. We need this to know when was a session last used to avoid making a backend call to verify session existence.
if session:
session._last_use_time = datetime.now()
google/cloud/spanner_v1/batch.py
Outdated
@@ -70,6 +73,10 @@ def insert(self, table, columns, values): | |||
:param values: Values to be modified. | |||
""" | |||
self._mutations.append(Mutation(insert=_make_write_pb(table, columns, values))) | |||
add_event_on_current_span( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this requirement was bought out when we met last time.
Thinking out loud.
Cloud spanner has a mutation limit of 40000 in one request.
So lets say a customer adds mutations in a loop, then this will print 40,000 events on the span.
Is this a good user experience? Also this could add up to the storage cost on customer end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great question but it won't be a problem because:
- spans have a default event limit of 128 https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/#span-limits
- storage costs not really as pricing on telemetry data is by span count rather than span data count unless perhps they have their own storage system but that's unlikely for those who'd be storing tens of thousands of span events
What we can do is perhaps document how to change span event limits and also document that we add events for span mutations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sakthivelmanii
I’d like to get your thoughts on this.
In my opinion, adding a separate event for every mutation could clutter the trace spans with too many events. Instead, I’d suggest avoiding one event per mutation. We already track the num_mutations attribute in the Commit request (here), which effectively provides customers with insights into the number of mutations in a commit.
Let me know what you think.
tests/system/test_session_api.py
Outdated
"CloudSpanner.CreateSession", | ||
"CloudSpanner.Transaction.execute_streaming_sql", | ||
"CloudSpanner.Transaction.execute_streaming_sql", | ||
"CloudSpanner.Transaction.commit", | ||
"CloudSpanner.Session.run_in_transaction", | ||
"CloudSpanner.Database.run_in_transaction", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some mistake in ordering here. Can you check?
Ideally the order should be,
"CloudSpanner.Database.run_in_transaction",
"CloudSpanner.CreateSession",
"CloudSpanner.Session.run_in_transaction",
"CloudSpanner.Transaction.execute_streaming_sql",
"CloudSpanner.Transaction.execute_streaming_sql",
"CloudSpanner.Transaction.commit",
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't a mistake though but rather the order in which spans are ended @harshachinta.
992d6c9
to
6d6b30a
Compare
google/cloud/spanner_v1/batch.py
Outdated
@@ -70,6 +73,10 @@ def insert(self, table, columns, values): | |||
:param values: Values to be modified. | |||
""" | |||
self._mutations.append(Mutation(insert=_make_write_pb(table, columns, values))) | |||
add_event_on_current_span( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sakthivelmanii
I’d like to get your thoughts on this.
In my opinion, adding a separate event for every mutation could clutter the trace spans with too many events. Instead, I’d suggest avoiding one event per mutation. We already track the num_mutations attribute in the Commit request (here), which effectively provides customers with insights into the number of mutations in a commit.
Let me know what you think.
"exception", | ||
{ | ||
"exception.type": "IndexError", | ||
"exception.message": "pop from empty list", | ||
"exception.stacktrace": "EPHEMERAL", | ||
"exception.escaped": "False", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is there a repeation of exception event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So given that we have a direct invocation of with trace_call("testBind")
per
try:
with trace_call("testBind", fauxSession):
pool.bind(database)
except Exception:
pass
that then invokes code internally where an exception happens. With the exception re-raised and unhandled until the highest parent, OpenTelemetry records the exception even on the parent. If we swallowed the exception internally we wouldn't
20fe353
to
a69a4dc
Compare
This change carves out parts of PR googleapis#1241 in smaller pieces to ease with smaller reviews. This change adds more span events, updates important spans to make them more distinct like changing: "CloudSpanner.ReadWriteTransaction" to more direct and more pointed spans like: * CloudSpanner.Transaction.execute_streaming_sql Also added important spans: * CloudSpanner.Database.run_in_transaction * CloudSpanner.Session.run_in_transaction
…ble + address test feedback
Referencing issue googleapis#1269, this update removes adding a span event per mutation, in favour of a future TODO.
5208f8e
to
37594f7
Compare
* observability: add updated span events + traace more methods This change carves out parts of PR googleapis#1241 in smaller pieces to ease with smaller reviews. This change adds more span events, updates important spans to make them more distinct like changing: "CloudSpanner.ReadWriteTransaction" to more direct and more pointed spans like: * CloudSpanner.Transaction.execute_streaming_sql Also added important spans: * CloudSpanner.Database.run_in_transaction * CloudSpanner.Session.run_in_transaction * all: update review comments + show type for BeginTransaction + remove prints * Remove requested span event "Using Transaction" * Move attempts into try block * Transform Session.run_in_transaction retry exceptions into events * More comprehensive test for events and attributes for pool.get * Add test guards against Python3.7 for which OpenTelemetry is unavailable + address test feedback * Remove span event per mutation in favour of future TODO Referencing issue googleapis#1269, this update removes adding a span event per mutation, in favour of a future TODO. * Sort system-test.test_transaction_abort_then_retry_spans spans by create time * Delint tests
This change carves out parts of PR #1241 in smaller pieces to ease with smaller reviews.
This change adds more span events, updates important spans to make them more distinct like changing:
"CloudSpanner.ReadWriteTransaction" to more direct and more pointed spans like:
Also added important spans: